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Here,  we  apply  the  harmonic  Fourier  beads  (HFB)  path  optimization  method  to  study  chemical  re¬ 
actions  involving  covalent  bond  breaking  and  forming  on  quantum  mechanical  (QM)  and  hybrid 
QM/molecular  mechanical  (QM/MM)  potential  energy  surfaces.  To  improve  efficiency  of  the  path 
optimization  on  such  computationally  demanding  potentials,  we  combined  HFB  with  conjugate  gra¬ 
dient  (CG)  optimization.  The  combined  CG-HFB  method  was  used  to  study  two  biologically  rele¬ 
vant  reactions,  namely,  L-  to  D-alanine  amino  acid  inversion  and  alcohol  acylation  by  amides.  The 
optimized  paths  revealed  several  unexpected  reaction  steps  in  the  gas  phase.  For  example,  on  the 
B3LYP/6-31G(d,p)  potential,  we  found  that  alanine  inversion  proceeded  via  previously  unknown 
intermediates,  2-iminopropane-l,l-diol  and  3-amino-3-methyloxiran-2-ol.  The  CG-HFB  method  ac¬ 
curately  located  transition  states,  aiding  in  the  interpretation  of  complex  reaction  mechanisms.  Thus, 
on  the  B3LYP/6-31G(d,p)  potential,  the  gas  phase  activation  barriers  for  the  inversion  and  acylation 
reactions  were  50.5  and  39.9  kcal/mol,  respectively.  These  barriers  determine  the  spontaneous  loss  of 
amino  acid  chirality  and  cleavage  of  peptide  bonds  in  proteins.  We  conclude  that  the  combined  CG- 
HFB  method  further  advances  QM  and  QM/MM  studies  of  reaction  mechanisms.  ©  2013  Author(s). 
All  article  content,  except  where  otherwise  noted,  is  licensed  under  a  Creative  Commons  Attribution 
3.0  Unported  License.  [http;//dx.doi.org/10. 1063/1. 4826470] 


I.  INTRODUCTION 

Our  understanding  of  complex  chemical  transformations 
relies  on  the  identification  of  elementary  reactions  involved 
in  the  process:  the  reaction  mechanism.  Each  elementary  re¬ 
action  involves  a  reactant,  a  product,  and  a  connecting  transi¬ 
tion  state.  Computational  chemistry  maps  these  reactions  onto 
potential  energy  surfaces,  providing  both  structures  and  ener¬ 
getics  for  analysis.  The  mapping  relies  on  energy  optimization 
to  identify  a  reaction  path  that  starts  at  the  reactant,  passes 
through  the  transition  state,  and  ends  at  the  product. 

Transition  states  (first-order  saddle  points  on  poten¬ 
tial  energy  surfaces)  determine  the  rates  of  elementary 
reactions'’^  but  are  hard  to  find.^  The  search  for  transition 
states  requires  specialized  optimization  techniques  that  climb 
potential  energy  ridges  and  succeed  only  when  the  starting 
structures  are  sufficiently  close  to  the  transition  states."'"®  Fur¬ 
thermore,  slight  differences  in  starting  structures  can  lead  to 
different,  often  irrelevant  transition  states. 

In  contrast  to  transition  states,  reactant,  product,  and  in¬ 
termediate  states  (local  minima  on  potential  energy  surfaces) 
do  not  require  specialized  optimization  techniques.  The  local 
minima  can  be  efficiently  located  using  conventional  downhill 
energy  minimization  techniques.^’ 

Before  the  reactant,  transition,  and  product  states  can  be 
used  to  represent  a  reaction  path  on  a  corresponding  three- 
point  energy  diagram,  the  transition  state  must  be  validated. 
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First,  the  gradient  of  the  potential  at  the  transition  state  must 
be  zero,  just  like  at  a  local  minimum.  Second,  a  separate  nor¬ 
mal  mode  calculation  must  confirm  that  the  transition  state 
has  a  single  imaginary  frequency,  i.e.,  is  a  first-order  saddle 
point.  The  magnitude  of  the  imaginary  frequency  is  represen¬ 
tative  of  the  potential  energy  curvature  at  the  transition  state. 
Displacements  along  the  normal  mode  vector  with  imaginary 
frequency  should  move  key  atoms  in  the  direction  of  either 
the  reactant  or  the  product.  Ideally,  an  additional  calculation, 
namely,  the  intrinsic  reaction  coordinate  (IRC),""'^  should  be 
performed  to  ensure  that  the  transition  state  connects  the  re¬ 
actant  with  the  product  and  not  with  any  other  unexpected  in¬ 
termediate  state.  Both  the  normal  mode  and  IRC  calculations 
become  prohibitively  expensive  as  the  system  size  grows, 
necessitating  other  means  of  transition  state  validation. 

Because  finding  the  correct  transition  states  using  spe¬ 
cialized  optimization  techniques  is  fraught  with  challenges, 
the  process  becomes  more  of  an  art  than  science,  hindering 
studies  of  reaction  mechanisms.  The  situation  could  be  im¬ 
proved  with  the  introduction  of  computational  tools  that  ro¬ 
bustly  pinpoint  correct  transition  states  with  minimal  user  in¬ 
put.  We  think  that  full  reaction  path  optimization  is  one  of 
the  best  solutions  to  this  problem.  In  contrast  to  optimization 
techniques  that  operate  on  a  single  state,  reaction  path  opti¬ 
mization  techniques  operate  on  a  “chain  of  states”  that  con¬ 
nect  the  reactant  and  product  states.^""'’  This  connection 
between  the  states  in  the  path  should  be  sufficient  to  accu¬ 
rately  and  robustly  pinpoint  the  transition  states  using  simple 
energy  minimizations. 


'S)  Q 


0021  -9606/201 3/1 39(1 6)/1 651 04/1 0 


139,  165104-1 


©Author(s)  2013 


Report  Documentation  Page 

Form  Approved 

0MB  No.  0704-0188 

Public  reporting  burden  for  the  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  the  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information, 
including  suggestions  for  reducing  this  burden,  to  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports,  1215  Jefferson  Davis  Highway,  Suite  1204,  Arlington 

VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  a  penalty  for  failing  to  comply  with  a  collection  of  information  if  it 
does  not  display  a  currently  valid  0MB  control  number. 

1.  REPORT  DATE 

^  REPORT  TYPE 

3.  DATES  COVERED 

00-00-2013  to  00-00-2013 

4.  TITLE  AND  SUBTITLE 

Exploring  chemical  reaction  mechanisms  through  harmonic  Fourier 
heads  path  optimization 

5a.  CONTRACT  NUMBER 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

6.  AUTHOR(S) 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

US  Army  Medical  Research  and  Materiel  Command, DoD  Biotechnology 
High  Performance  Computing  Software  Applications 

Institute, Telemedicine  and  Advanced  Technology  Research  Center, Fort 
Detrick, MD, 21702 

8.  PERFORMING  ORGANIZATION 

REPORT  NUMBER 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

10.  SPONSOR/MONITOR’S  ACRONYM(S) 

11.  SPONSOR/MONITOR’S  REPORT 
NUMBER(S) 

12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distrihution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

15.  SUBJECT  TERMS 

16.  SECURITY  CLASSIFICATION  OF:  17.  LIMITATION  OF 

_ _ _  ABSTRACT 

18.  NUMBER  19a.  NAME  OF 

OF  PAGES  RESPONSIBLE  PERSON 

a.  REPORT  b.  ABSTRACT  c.  THIS  PAGE  Same  aS 

unclassified  unclassified  unclassified  Report  (SAR) 

10 

Standard  Form  298  (Rev.  8-98} 

Prescribed  by  ANSI  Std  Z39-18 


165104-2 


Khavrutskii,  Smith,  and  Wallqvist 


J.  Chem.  Phys.  139,  165104  (2013) 


Here,  to  study  chemical  reactions  we  use  an  approach 
that  can  simultaneously  locate  and  validate  transition  states 
in  a  robust  manner.  The  approach  is  based  on  the  reaction 
path  optimization  method,  called  harmonic  Fourier  beads 
The  HFB  method  builds  on  such  popular  path 
optimization  methods  as  the  nudged  elastic  band  (NEB)^’^^ 
and  string  methods."^’ Unlike  NEB  and  string 
method,  HEB  uses  the  Eourier  parametrization  of  the  path  pi¬ 
oneered  by  Doll,  Freeman,  and  their  co-workers.^"’^'  Origi¬ 
nally,  the  HFB  method  was  used  with  molecular  mechanical 
(MM)  potentials  that  did  not  allow  covalent  bond  breaking  or 
forming. The  present  work  extends  the  applicability 
of  the  HFB  method  to  chemical  reactions  involving  covalent 
bond  breaking  and  forming  on  quantum  mechanical  (QM)  and 
hybrid  QM/MM  potentials. 

Because  QM  and  QM/MM  potentials  are  computation¬ 
ally  more  demanding  than  MM  potentials,  we  combined  the 
HFB  method  with  the  conjugate  gradient  (CG)  optimizer^^ 
for  added  efficiency.  The  combination  is,  thus,  referred  to  as 
CG-HFB.  Almost  two  decades  ago,  Doll,  Freeman,  and  Ma- 
tro  devised  a  method  to  search  for  transition  states  using  the 
Fourier  parametrization  of  the  path  with  CG  optimizer.^®’^'  In 
particular,  they  attempted  to  locate  transition  states  by  opti¬ 
mizing  the  Fourier  amplitudes  of  the  path  -  an  approach  that, 
as  the  authors  acknowledged,  was  not  entirely  reliable.^'  In 
contrast,  the  CG-HFB  optimizes  the  Cartesian  coordinates  of 
the  snapshots  or  beads  along  the  path  and  uses  the  Fourier 
parametrization  to  string  them.  Furthermore,  CG-HFB  uses 
adaptive  harmonic  restraints  during  optimization  to  prevent 
the  beads  from  collapsing  into  the  nearest  minima.  These  re¬ 
straints  transform  the  rugged  energy  landscapes  into  simpler, 
funnel-like  landscapes,^®’ which  simplify  the  optimization 
of  all  the  beads  and  make  the  CG-HFB  method  more  reliable. 

With  the  help  of  the  combined  CG-HFB  method  we 
studied  mechanisms  of  two  biologically  relevant  reactions. 
Specifically,  we  studied  monomolecular  L-  to  D-alanine 
amino  acid  inversion  and  bimolecular  alcohol  acylation  by 
amides.  All  reactions  were  studied  in  the  gas  phase  using 
Hartree-Fock  self-consistent  field  (HF-SCF)^*  and  a  Becke 
three-parameter  hybrid  functional  combined  with  Lee- Young- 
Parr  correlation  functional  (B3LYP)^®“^^  with  6-31G(d,p)"^^ 
basis  set.  CG-HFB  has  successfully  identified  multiple  re¬ 
action  paths  and  located  corresponding  transition  states.  Im¬ 
portantly,  CG-HFB  required  only  minimal  initial  preparation 
and  a  few  trivial  adjustments  over  the  course  of  path  opti¬ 
mizations,  thus  significantly  simplifying  the  search  for  and 
validation  of  transition  states. 

II.  METHODS 

A.  Brief  overview  of  the  CG-HFB  method 

The  key  ideas  and  features  of  the  combined  CG- 
HFB  implementation  and  its  differences  from  ear¬ 
lier  HFB  implementations'®"^^’ and  other  related 
methods^’ are  provided  in  the  supplementary 
material.®^ 

In  brief,  CG-HFB  performs  optimization  in  Carte¬ 
sian  coordinates,  which  avoids  costly  transformations  and 


complicated  corrections  associated  with  internal  coordi¬ 
nates.  The  CG  optimization  is  performed  on  the  landscape 
that  is  modified  by  adaptive  positional  harmonic  restraints, 
which  simplify  the  optimization  and  prevent  abrupt  structural 
changes  along  the  path.  Optionally,  the  CG  optimizations  can 
be  performed  orthogonally  to  the  path.  The  CG  optimizations 
in  Cartesian  coordinates  require  the  least  amount  of  storage 
and  make  the  CG-HFB  method  particularly  suited  for  large 
systems. 

Each  CG-HEB  path  optimization  step  begins  with  gener¬ 
ating  coordinates  of  equidistant  beads.  With  the  exception  of 
the  initial  step,  the  equidistant  coordinates  are  obtained  from 
the  coordinates  of  the  beads  optimized  at  the  previous  step 
with  the  help  of  the  Eourier  parametrization.  The  equidistant 
coordinates  are  used  as  the  new  reference  structures  or  an¬ 
chors  of  the  harmonic  restraints.  The  CG  optimizations  of 
the  harmonically  restrained  beads  are  then  performed.  The 
optimization  step  ends  when  all  of  the  CG  optimizations  of 
the  harmonically  restrained  beads  finish.  The  CG  optimiza¬ 
tions  of  the  beads  between  the  path  optimization  steps  are 
independent  and,  therefore,  can  run  in  parallel  for  speed. 

Importantly,  CG-HFB  can  extract  transition  states  from 
optimized  paths  using  continuous  energy,  gradient,  and  coor¬ 
dinate  path  curves.  The  transition  states  are  expected  to  be 
accurate  when  the  continuous  energy  profiles  reconstructed 
from  the  gradients  at  the  beads  match  the  actual  energies  of 
the  beads  along  the  path.  To  fulfill  this  condition,  additional 
beads  can  be  inserted  into  the  path.  The  CG-HFB  optimized 
paths  automatically  satisfy  the  IRC  condition""'®  and,  hence, 
validate  the  extracted  transition  states. 

B.  L-  to  D-alanine  inversion 

To  study  the  inversion  of  an  asymmetric  a-carbon  of  ala¬ 
nine  in  the  gas  phase,  we  used  the  neutral  alanine  molecule 
shown  in  Fig.  1. 

1.  Generating  initial  paths 

First,  we  generated  the  coordinates  of  L-alanine,  the  re¬ 
actant.  The  reactant  was  optimized  to  the  nearest  local  mini¬ 
mum  using  single-bead  HFB  optimization  with  soft  harmonic 
restraints  on  all  the  atoms.  The  coordinates  of  the  optimized 
L-alanine  were  then  inverted  to  yield  D-alanine,  the  product. 

To  explore  alanine  inversion,  we  prepared  three  crude 
intermediates  that  would  bias  the  reaction  paths  to  go  via 
three  different  routes.  In  particular,  by  displacing  the  Hq,  pro¬ 
ton  along  the  covalent  bonds  connecting  the  Cq,  carbon  atom 
to  the  three  chemical  groups,  namely,  -NH2,  -C(0)OH,  and 


FIG.  1.  Alanine  inversion.  The  Hq,  proton  that  is  involved  in  the  alanine 
inversion  is  shown  in  bold  green. 
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-CH3,  we  created  initial  coordinates  for  the  intermediates. 
The  coordinates  of  each  intermediate  were  then  inserted  into 
the  input  xyz  file  (refer  to  supplementary  material®'^  for  de¬ 
tails)  between  the  reactant  and  product  coordinates  and  passed 
to  CG-HFB  to  generate  three  initial  paths. 


2.  Optimization  of  the  paths 

Path  optimizations  began  at  the  HF/6-31G  level  of  theory 
and  progressed  through  HF/6-31G(d)  to  F[F/6-31G(d,p)  and 
finally  to  B3LYP/6-31G(d,p).  The  bulk  of  optimization  was 
performed  using  24  beads/path.  If  necessary,  the  number  of 
beads  in  each  path  was  increased  to  obtain  accurate  integral 
energy  profiles  and  transition  states. 

Paths  with  more  than  one  transition  state  were  split  into 
elementary  segments,  each  containing  a  single  transition  state. 
Segments  were  then  optimized  separately.  The  force  constants 
and  numbers  of  beads  could  be  adjusted  for  each  segment 
according  to  the  transition  state  energy  and  curvature.  Ul¬ 
timately,  this  flexibility  made  the  optimizations  more  effi¬ 
cient  and  allowed  for  improved  integral  energy  profiles  and 
transition  state  structures. 

Transition  states  extracted  from  the  CG-FIFB  optimized 
paths  were  used  as  initial  guesses  to  obtain  what  we  call 
“exact  transition  states.”  The  exact  transition  states  were  ob¬ 
tained  using  the  best  transition  state  optimization  method^’® 
available  in  GAUSSIAN09"^^  with  the  tight  convergence  cri¬ 
teria.  Typically,  the  optimizations  to  get  the  exact  transition 
states  began  with  computing  analytical  Hessians  and  then 
continued  with  reduced  maximum  step  size  of  0.015  A. 
The  majority  of  the  optimizations  completed  in  less  than 
10  steps. 


C.  Acylation  by  peptide 

We  studied  the  acylation  reaction  with  the  CG-HFB 
method  using  both  QM  and  QM/MM  potentials.  Two  systems, 
namely,  “small”  and  “large”  were  considered  as  shown  in 
Fig.  2.  The  small  system  comprised  methanol  and  for- 
mamide.  The  large  system  substituted  A-methylacetamide  for 
formamide. 


1.  ONiOM  QM/MM  setup 

The  large  system  was  partitioned  such  that  the  two 
methyl  groups  of  A-methylacetamide  were  in  the  MM  level 
(see  Fig.  2).  With  this  partitioning,  the  QM  layer  of  the  large 
system  in  ONIOM  QM/MM  is  identical  to  that  of  the  small 
system. 

Any  ONIOM  QM/MM  reaction  path  calculation  depends 
somewhat  on  the  MM  parameters  that  span  the  QM  region. 
This  is  why  careful  partitioning  of  the  system  between  QM 
and  MM  layers  is  crucial.  A  problematic  partitioning  scheme 
would  terminate  the  QM  layer  too  close  to  a  covalent  bond 
being  broken,  formed,  or  altered  in  a  way  that  reduces  its 
bond  order.^*  MM  atoms  up  to  two  bonds  away  from  the  af¬ 
fected  bond  are  influenced  by  the  MM  dihedral  terms  involv¬ 
ing  the  affected  bond.  Moreover,  atoms  one  bond  away  from 
the  affected  bond  are  involved  in  the  corresponding  MM  angle 
bending  terms.  None  of  these  terms  can  be  properly  cancelled 
by  the  ONIOM  QM/MM  scheme."^*  Whenever  possible,  it  is 
best  to  ensure  that  no  purely  MM  atom  is  influenced  by  the 
affected  bonds. 

Due  to  the  small  size  of  the  system  being  studied  in  this 
work,  we  could  not  avoid  the  partitioning  issues"^*  described 
above  and  needed  a  compromise.  Because  our  goal  was  to 
assess  the  utility  of  the  CG-HFB  method  for  QM/MM  appli¬ 
cations,  we  favored  simplicity  over  accuracy  in  the  ONIOM 
QM/MM  calculations.  To  remove  spurious  dihedral  and  angu¬ 
lar  MM  terms  associated  with  the  modified  bonds,  we  simply 
removed  such  bonds  from  the  MM  connectivity  tables.  To  as¬ 
sess  the  effect  of  the  atom  types  and  of  the  vdW  parameters 
on  energy  profiles,  we  performed  path  optimizations  using 
both  the  reactant  state,  i.e.,  methanol  and  A-methylacetamlde, 
and  the  product  state,  i.e.,  the  methyl  ester  of  acetic  acid 
and  A-methylamine.  In  these  calculations,  we  allowed  the 
partial  charges  to  be  computed  on  the  fly  with  the  charge 
equilibration  method."^® 


2.  Generating  initiai  paths 

To  initialize  reaction  path  optimization,  we  prepared 
several  intermediate  states  using  single-bead  optimizations 
with  added  distance  restraints  (see  supplementary  material®"^). 


Alcohol  +  Peptide  Ester  +  Amine 


FIG.  2.  Schematic  representation  of  alcohol  acylation  by  a  peptide.  Covalent  bonds  involved  in  the  reaction  are  shown  in  bold.  The  hydrogen  of  the  alcohol 
that  transfers  to  the  amide  is  shown  in  bold  green.  The  methyl  groups  that  are  partitioned  into  the  MM  layer  of  “our  own  A^-layer  integrated  molecular  orbital 
MM”  method  [ONIOM(QM:MM)]  calculation  are  circled.  The  reactant  and  product  as  drawn  correspond  to  the  larger  model.  The  small  system  was  derived 
from  the  large  system  by  substituting  the  circled  methyl  groups  for  a  hydrogen  atom. 
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L-Alanine  D-Alanine 


2-Amino-1-hydroxyprop-1-en-1-olate  2-Amino-1-hydroxyprop-1-en-1-olate 

FIG.  3.  Key  steps  of  the  Ca-NH2  path.  The  Hq.  proton  that  is  involved  in  the 
alanine  inversion  is  sho\vn  in  bold  green. 

Coordinates  of  reactant  and  product  states  for  bimolecular  re¬ 
actions  were  obtained  by  optimization  on  the  corresponding 
sides  of  the  reaction  barrier. 

3.  CG-HFB  optimization  of  the  paths 

Path  optimizations  of  the  large  system  began  at  the  HF/ 
6-3 IG  level  of  theory  and  progressed  through  HF/6-31G(d) 
to  HF/6-31G(d,p)  and  finally  to  B3LYP/6-31G(d,p).  For  the 
small  system  and  for  the  large  system  treated  with  ONIOM 
QM/MM,  the  highest  level  of  QM  used  was  HF/6-31G(d,p). 
Optimizations  of  the  paths  were  performed  starting  with  24 
beads/path.  The  number  of  beads  was  gradually  increased  to 
provide  accurate  integral  energy  profiles  and  transition  states. 
The  exact  transition  states  were  obtained  as  described  for  the 
alanine  inversion. 

III.  RESULTS  AND  DISCUSSION 

In  this  work,  we  explored  two  biologically  relevant  model 
reactions.  First,  we  studied  the  inversion  of  the  asymmetric 
carbon  of  alanine  a-amino  acid,  converting  it  from  the  L- 
stereoisomer  to  the  D-stereoisomer.  Second,  we  studied  the 
one-step  acylation  of  an  alcohol  by  an  amide,  a  peptide  mimic. 
Both  reactions  require  QM  description  as  they  involve  cova¬ 
lent  bond  breaking  and  forming.  The  barrier  for  alanine  in¬ 
version  provides  the  upper  limit  to  spontaneous  racemization 
of  a-amino  acids  that  comprise  all  known  life  forms.  On  the 


other  hand,  the  barrier  for  the  alcohol  acylation  serves  as  the 
limit  to  the  spontaneous  acylation  that  would  disrupt  peptide 
bonding  between  amino  acids  in  proteins. 

A.  L-  to  D-alanine  inversion 

To  study  the  inversion  of  an  asymmetric  a -carbon  of  ala¬ 
nine  in  the  gas  phase,  we  used  the  neutral  alanine  molecule 
shown  in  Fig.  1.  Technical  details  of  path  preparation  and 
optimization  are  provided  in  Sec.  II.  Here,  we  provide  ener¬ 
getics  at  the  B3LYP/6-31G(d,p)  level  of  theory.  The  supple¬ 
mentary  material®^  also  provides  results  at  the  HF/6-31G(d,p) 
level  of  theory.  Although  the  HF/6-31G(d,p)  results  differed 
from  B3LYP/6-31G(d,p)  results  quantitatively  and,  some¬ 
times,  qualitatively,  they  helped  identify  intermediates  that  we 
would  have  missed  otherwise. 


1.  The  Ca-NHz  path 

The  optimized  Ca,-NH2  path  comprises  six  segments,  fea¬ 
turing  seven  local  minima  and  six  transition  states.  The  key 
steps  of  the  path  are  shown  in  Fig.  3.  The  energetics  and 
other  parameters  of  the  actual  path  are  shown  in  Table  I  and 
Fig.  4.  The  path  identified  three  different  conformations  of 
alanine,  namely,  Sl/R,  Sl/P  (S2/R),  and  S5/P  (S6/R).  Here, 
Sl/R  and  Sl/P  refer  to  the  reactant  and  product  of  the  first 
segment,  respectively.  Interestingly,  the  Sl/P  conformation 
was  1.8  kcal/mol  lower  in  energy  than  the  starting  Sl/R  con¬ 
formation.  Accidentally,  the  Sl/P  conformation  corresponds 
to  the  global  minimum  of  alanine  in  the  gas  phase  that  was 
the  subject  of  earlier  theoretical^°“^^  and  experimental^^’ 
studies. 

The  Ca,-NH2  inversion  began  with  -NH2  group  rotation 
into  the  lowest  energy  conformation  of  alanine.  The  barrier 
for  the  rotation  was  0.3  kcal/mol  in  the  forward  direction. 
Next,  the  Hq.  proton  transferred  to  the  nitrogen  of  the  -NH2 
group,  turning  it  into  positively  charged  -NH3+.  The  barrier 
for  the  proton  transfer  was  the  highest  energy  point  along  the 
path  at  65.7  kcal/mol  from  the  lowest  energy  conformation, 
Sl/P. 

The  six  atoms  of  the  resulting  meta-stable  intermediate 
S2/P  lie  in  a  prochiral  plane  with  the  sp^  Cq  atom  in  the 


TABLE  1.  Summary  of  the  Ca-NH2  path  for  alanine  inversion.^ 


Path  segment 

Er 

(kcal/mol) 

Ep 

(kcal/mol) 

Ets 

(kcal/mol) 

^EhfB-TS 

(kcal/mol) 

Imag  Ereqps 
(AFreqHFB-Ts)  (cm~') 

TS-HFB  RMSDau 
(RMSD^HMahyl)  (A) 

k 

(kcal/mol/ A^) 

M{K) 

SI 

0.0 

-1.8 

0.3 

0.00 

168  (0) 

0.002  (0.002) 

70 

23(12) 

S2 

-  1.8 

32.9 

63.8 

0.02 

1627  (-4) 

0.018(0.014) 

1400 

64  (64) 

S3 

32.9 

32.9 

39.4 

0.00 

300  (-  1) 

0.002  (0.001) 

70 

31  (16) 

S4 

32.9 

-  1.8 

63.8 

0.02 

1627  (-4) 

0.022  (0.017) 

1400 

64  (64) 

S5 

-  1.8 

-0.2 

0.2 

0.00 

177  (-  1) 

0.003  (0.003) 

70 

17(9) 

S6 

-0.2 

0.0 

4.0 

0.08 

653  (1) 

0.045  (0.044) 

500 

20(10) 

Ep,  and  Eps  are  B3LYP/6-31G(d,p)  energy  values  relative  to  the  original  conformation  of  alanine  (—323.754250  Hailree)  of  the  exact  reactant,  product,  and  transition  state  (TS), 
respectively.  AEhfb-ts  is  the  energy  difference  between  the  TS  extracted  from  the  conjugate  gradient-harmonic  Fourier  beads  (CG-HFB)-optimized  path  and  the  exact  TS.  The  force 
constant  (k)  is  different  for  different  segments.  The  M  (K)  column  shows  the  total  number  of  beads  (M)  with  the  number  of  Fourier  terms  in  the  series  (K)  used  in  the  final  optimized 
path  in  parentheses.  All  segments  were  optimized  without  projecting  the  forces.  The  Imag  Freqps  column  shows  the  exact  imaginary  frequency  values  for  the  TSs  and  their  differences 
with  CG-HFB  values  in  parentheses.  The  TS-HFB  RMSD/^u  column  shows  all-atom  root  mean  square  values,  with  RMSD.fimerhvi  values  excluding  the  hydrogen  atoms  of  the  methyl 
groups  in  parentheses. 
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FIG.  4.  Energy  profile  for  the  full  Cq.-NH2  path.  The  integral  energy  profile 
reconstructed  from  the  forces  is  shown  in  solid  blue.  The  actual  energy  points 
offset  to  match  the  integral  profile  are  shown  as  red  plus  signs.  For  clarity,  the 
elementary  segments  are  separated  by  vertical  lines.  Each  segment  contains  a 
single  transition  state  (TS)  marked  by  a  black  “+”  and  covers  one  unit  of  the 
progress  variable. 


3-Amino-3-methyloxiran-2-ol 


FIG.  5.  Key  steps  of  Ca-C(0)0H  path  1.  The  Hq,  proton  that  is  involved  in 
the  alanine  inversion  is  shown  in  bold  green. 


center.  The  planar  configuration  was  stabilized  by  an  allylic- 
like  resonance  with  the  carbonyl  oxygen  of  the  -C(0)0H 
group,  which  delocalized  the  negative  charge  from  the 
atom.  The  metastable  intermediate  S2/P  was  34.7  kcal/mol 
above  the  lowest  energy  conformation  of  alanine. 

Subsequently,  the  -NH3+  group  rotated  followed  by  the 
symmetric  proton  transfer  back  to  the  Ca  atom  on  the  other 
side  of  the  prochiral  plane.  Rotation  of  the  -NH3+  group  in 
the  prochiral,  meta-stable  intermediate  had  a  symmetric  bar¬ 
rier  of  6.5  kcal/mol.  In  principle,  the  -NH3+  rotation  is  not 
required  as  the  three  hydrogen  atoms  are  equivalent.  How¬ 
ever,  by  construction,  our  path  follows  a  particular  hydrogen 
atom  all  the  way  through  the  inversion.  The  barrier  for  the  Hq, 
proton  transfer  back  to  the  €„  atom  on  the  other  side  of  the 
prochiral  meta-stable  intermediate  was  31.0  kcal/mol. 

The  final  stage  of  the  Cq;-NH2  inversion  could  be  as  sim¬ 
ple  as  the  mirror  image  of  the  initial  stage,  i.e.,  rotation  of 
the  -NH2  group.  However,  our  optimized  path  concluded  dif¬ 
ferently.  Specifically,  starting  from  the  lowest  energy  confor¬ 
mation  of  D-alanine,  namely,  S4/P,  the  -NH2  group  rotated 
into  yet  another  conformation  of  alanine,  S5/P,  with  an  en¬ 
ergy  of  1.6  kcal/mol  and  a  forward  barrier  of  2.0  kcal/mol. 
From  this  intermediate,  the  -NH2  group  inverted  into  the 
final  product  configuration,  S6/P,  with  a  forward  barrier  of 
4.2  kcal/mol.  The  -NH2  inversion  brought  the  transformation 
to  a  close  with  the  final  product,  S6/P,  precisely  mirroring  the 
S 1/R  conformation  of  the  reactant  L-alanine. 


2.  Ca-C(0)OH paths 

The  exploratory  Ca-C(0)OH  path  surpassed  the  Ca,-NH2 
path  in  complexity  and  barrier  heights.  Despite  the  higher  bar¬ 
riers  in  certain  segments  of  the  exploratory  Ca.-C(0)OH  path, 
our  analysis  identified  an  important  segment  with  a  relatively 
low  barrier.  Specifically,  the  path  included  a  proton  transfer 
along  the  Cc,-C(0)OH  bond  with  a  barrier  of  50.5  kcal/mol. 
After  the  transfer,  the  path  detoured  through  much  higher 


energy  regions.  Still,  this  finding  suggested  that  a  lower  en¬ 
ergy  Ca-C(0)OH  path  could  exist.  Therefore,  we  rationally 
redesigned  the  Cq.-C(0)OH  path  with  the  hope  of  finding  a 
route  with  barrier  heights  below  65.7  kcal/mol. 

a.  Ha  proton  transfer  to  the  —C(0)OH  group.  The  Hq. 
proton  transfer  to  the  -C(0)OH  group  transformed  the  car¬ 
boxyl  group  into  a  geminal  diol  -C(OH)2H  group,  as  shown  in 
Fig.  5.  Even  when  forced  to  start  from  the  lowest  energy 
conformation  of  alanine,  the  reaction  proceeded  through  the 
original  conformation. 

Interestingly,  the  Hq.  proton  transfer  concluded  with  a 
secondary  proton  transfer.  The  secondary  proton  transferred 
from  the  -NH2  group  to  the  negatively  charged  oxygen  of 
the  -C(0“)(OH)H  group.  The  latter  group  existed  in  a  meta¬ 
stable  intermediate  on  the  HF/6-31G(d,p)  potential  (see  sup¬ 
plementary  material®^)  but  not  on  the  B3LYP/6-31G(d,p)  po¬ 
tential.  The  secondary  proton  transfer  required  the  -0“  as  an 
acceptor.  Relative  to  the  lowest  energy  conformation  of  ala¬ 
nine,  the  barrier  for  the  double  proton  transfer,  Sl/TS,  was 
52.3  kcal/mol  and  the  product  diol  was  21.0  kcal/mol.  This 
elementary  reaction  corresponds  to  the  first  segment,  SI,  of 
the  redesigned  Ca-C(0)OH  path,  as  shown  in  Fig.  6  and 
Table  II. 

The  double  proton  transfer  was  followed  by  the  rota¬ 
tion  of  one  of  the  geminal  -OH  groups  that  stabilized  the 
diol  by  0.2  kcal/mol.  The  barrier  for  this  rotation  was  only 
0.3  kcal/mol.  This  rotation  was  not  observed  at  the  HF/ 
6-31G(d,p)  level  and  added  a  separate  segment  to  the  Cq,- 
C(0)OH  path,  S2. 

From  the  geminal  diol  the  inversion  could  proceed  via 
two  different  paths,  path  1  and  path  2,  which  are  discussed 
below. 

b.  Ca-C(0)OH  path  1.  Segments  S3  and  S4  of  path  1 
(Fig.  6)  involved  an  oxiran-2-ol  intermediate,  as  shown  in 
Fig.  5.  This  path  was  originally  found  on  the  HF/6-31G(d,p) 
potential  energy  surface  (see  supplementary  material®"^),  and 
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FIG.  6.  Energy  profile  for  full  Ca-C(0)0H  path  1.  The  integral  energy  pro¬ 
file  reconstructed  from  the  forces  is  shown  in  solid  blue.  The  actual  energy 
points  offset  to  match  the  integral  profile  are  shown  as  red  plus  signs.  The 
elementary  segments  are  separated  by  vertical  lines  for  clarity.  Each  seg¬ 
ment  contains  a  single  TS  marked  by  a  black  “+”  and  covers  one  unit  of 
the  progress  variable. 

instead  of  the  geminal  diol,  HN=Cq.(CH3)-C(OH)2H  in¬ 
volved  its  zwitterionic  form,  H2N+=Ca(CH3)-C(0”)(0H)H. 
The  zwitterionic  form  was  not  stable  on  the  B3LYP/ 
6-31G(d,p)  potential  energy  surface. 

b.l.  Formation  of  oxiran-2-ol  from  the  geminal  diol.  Two 
oxiran-2-ol  intermediates  have  been  identihed,  as  described 
in  the  supplementary  material.®^  One  of  the  oxiran-2-ol  inter¬ 
mediates  could  be  used  to  complete  the  inversion,  as  schemat¬ 
ically  shown  in  Fig.  5.  Relative  to  the  lowest  conformation 
of  alanine,  the  energy  of  the  productive  oxiran-2-ol  was  33.9 
kcal/mol.  The  transition  state  S3/TS  to  form  the  oxiran-2- 
ol  was  53.9  kcal/mol  above  the  lowest  conformation  of  ala¬ 
nine.  Interestingly,  oxiran-2-ol  formation  was  also  coupled  to 
a  proton  transfer.  Specifically,  the  proton  from  one  of  the  - 
OH  groups  of  the  geminal  diol  transferred  back  to  the  =NH 
group.  This  reaction  corresponds  to  the  S3  segment  shown  in 
Fig.  6. 

b.2.  Fla  proton  transfer  back  to  the  Ca  atom.  Finally, 
we  optimized  proton  transfer  from  the  productive  oxiran-2-ol 
back  to  the  Cq.  atom,  effecting  the  inversion  in  the  S4  segment, 
as  shown  in  Fig.  6.  Relative  to  the  lowest  energy  conforma¬ 
tion  of  alanine,  the  energy  of  the  transition  state  for  the  final 
proton  transfer,  S4/TS,  shown  in  Fig.  6  was  55.8  kcal/mol. 


2-lmlnopropane-1 ,1-diol  2-lminopropane-1 ,1-diol 


FIG.  7.  Key  steps  of  Ca-C(0)0H  path  2.  The  proton  that  is  involved  in 
the  alanine  inversion  is  shown  in  bold  green. 

Considering  the  S4  reaction  segment  in  reverse,  it  becomes 
clear  that  the  oxiran-2-ol  can  be  formed  from  alanine  directly, 
bypassing  the  diol  intermediate.  However,  this  requires  a  ro¬ 
tation  of  the  protonated  carboxyl  group. 

b. 3.  Rotation  of  the  carboxyl  group.  The  carboxyl  group 
in  the  inverted  alanine  was  flipped  compared  with  the  orig¬ 
inal  alanine.  Indeed,  the  hydroxy  (-OH)  group  rather  than 
the  carbonyl  (=0)  group  interacted  with  the  -NH2  group  in 
the  inverted  alanine.  The  final  segmenf,  S5,  of  the  redesigned 
Ca-C(0)OH  path  1  shown  in  Fig.  6  is  the  rotation  of  the 
carboxyl  group.  The  original  conformation  of  alanine,  Sl/R, 
was  only  0.6  kcal/mol  lower  than  the  flipped  conformation, 
S5/R.  The  carboxyl  group  can  rotate  in  two  opposing  direc¬ 
tions.  The  lowest  barrier  for  the  carboxyl  group  rotation  from 
the  flipped  conformation,  S5/R,  toward  the  original  conforma¬ 
tion  S5/P,  as  identified  in  fhe  supplemenfary  material,®'^  was 
1.9  kcal/mol. 

From  Ck-C(0)OH  path  1,  we  conclude  that  the  ori¬ 
entation  of  the  carboxyl  -C(0)OH  group  determines  if  the 
Hq.  proton  transfer  will  result  in  geminal  diol  or  oxiran-2-ol 
formation. 

c.  Ca-C(0)OH  path!.  Because  the  formation  of  oxiran- 
2-ol  from  alanine  had  a  higher  barrier  than  that  of  the  geminal 
diol,  we  constructed  path  2,  which  bypasses  oxiran-2-ol,  as 
schematically  shown  in  Fig.  7.  Path  2  does  not  involve  any 
other  intermediates  besides  the  geminal  diol  but  requires  a  ro¬ 
tation  of  the  -C(0H)2H  group.  The  rotation  moves  the  proton 


TABLE  II.  Summary  of  Ct,-C(0)OH  path  1  for  alanine  inversion. “ 


Path  segment 

Er 

(kcal/mol) 

Ep 

(kcal/mol) 

Ets 

(kcal/mol) 

AEhfb-ts 

(kcal/mol) 

Imag  Ereqps 
(AFreqnFB-Ts)  (cm~') 

TS-HFB  RMSDah 
{RMSD.HMethyl)  (A) 

k 

(kcal/mol/ A^) 

M{K) 

SI 

0.0 

19.1 

50.5 

-0.10 

606  (29) 

0.070  (0.014) 

1400 

31  (16) 

S2 

19.1 

18.9 

19.5 

0.00 

205  (-9) 

0.011  (0.004) 

50 

10(6) 

S3 

18.9 

32.0 

52.0 

-0.03 

177  (-3) 

0.026  (0.012) 

1400 

55  (28) 

S4 

32.0 

0.6 

53.9 

-0.11 

344  (-64) 

0.103  (0.013) 

1000 

49  (25) 

S5 

0.6 

0.0 

2.5 

0.00 

28  (0) 

0.000  (0.000) 

70 

48  (24) 

^Er,  Ep,  and  Eps  are  B3LYP/6-31G(d,p)  energy  values  relative  to  the  original  conformation  of  alanine  (—323.754250  Hartree)  of  the  exact  reactant,  product,  and  TS,  respectively. 
AEfjFB-TS  is  the  energy  difference  between  the  TS  extracted  from  the  CG-HFB-optimized  path  and  the  exact  TS.  The  force  constant  {k)  is  different  for  different  segments.  The  M 
(K)  column  shows  the  total  number  of  beads  (M),  with  the  number  of  Fourier  terms  in  the  series  (K)  used  in  the  final  optimized  path  in  parentheses.  All  segments  were  optimized 
without  projecting  the  forces.  The  Imag  Freqjs  column  shows  the  exact  imaginary  frequency  values  for  the  TSs  and  their  differences  with  CG-HFB  values  in  parentheses.  The  TS-HFB 
RMSD/^ii  column  shows  all-atom  root  mean  square  values,  with  RMSD^tiMethyi  values  excluding  the  hydrogen  atoms  of  the  methyl  groups  in  pai'entheses. 
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FIG.  8.  Energy  profile  for  full  Ca-C(0)0H  path  2.  The  integral  energy  pro¬ 
file  reconstructed  from  the  forces  is  shown  in  solid  blue.  The  actual  energy 
points  offset  to  match  the  integral  profile  are  shown  as  red  plus  signs.  The 
elementary  segments  are  separated  by  vertical  lines  for  clarity.  Each  seg¬ 
ment  contains  a  single  TS  marked  by  a  black  “+”  and  covers  one  unit  of 
the  progress  variable. 

to  the  other  side  of  the  prochiral  plane  of  Co,  from  which  the 
proton  transfers  back  to  €„ ,  concluding  the  inversion. 

The  energy  prohle  along  path  2  is  shown  in  Fig.  8.  The 
first  two  segments  of  path  2  are  the  same  as  in  path  1.  Seg¬ 
ments  S4  and  S2  as  well  as  S5  and  SI  are  identical  after  inver¬ 
sion.  Just  like  the  carboxyl  group,  the  -C(OH)2H  group  can 
rotate  in  two  opposing  directions.  The  lowest  energy  rotation 
of  the  geminal  -C(OH)2H  group  was  identified  from  the  ex¬ 
ploratory  360°  rotation  path.  The  exploratory  path  contained 
eight  elementary  segments,  as  shown  in  the  supplementary 
material.®"^  The  lowest  energy  path  for  the  desired  rotation  was 
identihed  as  a  single  segment  with  a  barrier  of  2.7  kcal/mol 
and  was  used  as  Seg3  of  Ca,-C(0)0H  path  2.  Path  2  was  sym¬ 
metric  and  involved  five  segments.  The  barrier  S5/TS  to  form¬ 
ing  the  D-alanine  from  the  geminal  diol  was  31.3  kcal/mol. 
This  transition  state  was  52.3  kcal/mol  above  the  lowest 
energy  conformation  of  alanine. 

3.  The  Ca-CHs  path 

It  is  worth  noting  that  the  Ca-CHs  path,  unlike  the 
NFI2  and  Ca-C(0)0H  paths,  lacked  the  ability  to  transfer  the 
proton  from  to  the  -CH3  group  because  the  latter  could  not 
accept  additional  protons.  Instead,  optimization  of  the  path 
found  hydrogen  elimination  that  split  alanine  into  H2  and  2- 
amino-acrylic  acid.  These  two  molecules  were  28.8  kcal/mol 
above  the  lowest  conformation  of  alanine.  Flence,  this  in¬ 
version  path  involved  a  bimolecular  addition  of  H2  on  the 
other  side  of  the  prochiral  acrylic  plane.  However,  pathways 
involving  H2  molecule  elimination  and  addition  presented 
challenges  on  the  B3LYP/6-31G(d,p)  potential.  In  particular, 
unlike  the  restricted  HF/6-31G(d,p)  potential,  the  restricted 
B3LYP/6-31G(d,p)  potential  had  a  cusp  at  ~102  kcal/mol 
instead  of  a  transition  state. 

Based  on  the  energy  profiles  of  the  analyzed  paths,  we 
concluded  that  Cc-C(0)0H  path  1  and  path  2  were  more 
favorable  than  the  Cq.-NH2  and  Cc-CHs  paths.  Of  the  two 


Cq.-C(0)OH  paths,  path  2,  which  bypassed  oxiran-2-ol,  ap¬ 
peared  to  be  the  most  favorable.  The  highest  transition  state 
energy  along  C|3;-C(0)0H path  2  was  52.3  kcal/mol  above  the 
lowest  conformation  of  alanine  identified  in  this  work. 

B.  Acylation  by  peptide 

The  alanine  inversion  reaction  demonstrated  that  the 
combined  CG-HFB  method  could  be  used  to  study  in¬ 
tramolecular  reactions  with  rearrangements  of  covalent 
bonds.  Nevertheless,  the  experience  with  the  Co,-CH3  path 
suggested  that  bimolecular  reactions  could  be  more  difficult 
to  model.  The  preferred  relative  orientation  of  two  reacting 
molecules  is  well  defined  when  poised  for  the  attack  at  the 
transitions  state  but  becomes  less  defined  with  increasing  sep¬ 
aration.  In  the  end,  two  molecules  reacting  with  each  other 
and/or  their  products  may  separate  considerably  to  minimize 
their  interactions. 

In  this  section,  we  studied  a  bimolecular  reaction  while 
testing  the  limits  of  the  combined  CG-HFB  method.  Specifi¬ 
cally,  we  studied  alcohol  acylation  by  a  peptide,  as  shown  in 
Fig.  2.  This  reaction  is  biologically  important  as  a  model  of 
peptide  bond  cleavage.  When  catalyzed  by  enzymes,  the  re¬ 
action  proceeds  through  a  tetrahedral  intermediate.^^’ How¬ 
ever,  here,  we  studied  a  single-step  acylation  in  the  gas  phase 
where  oxygen  and  hydrogen  atoms  of  the  alcohol  added  to 
carbon  and  nitrogen  atoms  of  the  peptide  bond,  respectively. 
Figure  2  shows  the  two  bonds  that  broke  and  two  bonds  that 
formed  in  this  reaction. 

We  studied  the  acylation  reaction  with  the  combined  CG- 
HFB  method  on  both  QM  and  ONIOM  QM/MM  potential 
energy  surfaces.  In  particular,  we  compared  reaction  paths  for 
the  “small”  (methanol  -|-  formamide)  and  “large”  (methanol 
+  V-methylacetamide)  systems  shown  in  Fig.  2. 

In  Secs.  Ill  B  l-III  B  2,  we  will  first  discuss  the  QM 
results  on  the  small  and  large  systems.  In  Sec.  Ill  B  3,  we 
will  discuss  the  ONIOM  QM/MM  results  on  the  large  system 
and  compare  them  to  the  QM  results.  In  the  final  Sec.  Ill  B  4, 
we  discuss  the  relevance  and  transferability  of  the  ONIOM 
QM/MM  reaction  path  optimization  with  CG-HFB  to  studies 
of  much  larger  systems,  such  as  enzymes. 

1.  Small  system  QM  results 

The  combined  CG-HFB  path  optimization  performed  ex¬ 
ceptionally  well  in  locating  acylation  transition  states  for  the 
small  system.  The  reaction  began  with  the  OH  bond  of  the 
alcohol  aligning  with  the  C-N  bond  of  the  amide.  First,  the 
proton  from  the  OH  group  transferred  to  the  amide  nitrogen 
of  the  C-N  bond.  Next,  the  C-N  bond  broke  and  the  O-C 
bond  formed. 

Overall,  the  reaction  was  2.5  kcal/mol  exothermic  at  the 
HF/6-31G(d,p)  potential.  The  energy  difference  between  the 
exact  and  extracted  acylation  transition  states  in  the  small  sys¬ 
tem  was  only  0.02  kcal/mol,  as  shown  in  Table  III.  Structural 
comparison  of  the  exact  and  extracted  transition  states  for  the 
small  system  yielded  RMSDs  of  0.059  and  0.008  A,  includ¬ 
ing  all  atoms  and  excluding  the  hydrogen  atoms  of  the  methyl 
groups,  respectively. 
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TABLE  III.  Summary  of  optimized  reaction  paths  for  alcohol  acylation  by  peptide  mimics.^ 


Potential 

System 

Ets 

(Hartree) 

^EhfB-TS 

(kcal/mol) 

Er 

(kcal/mol) 

Ep 

(kcal/mol) 

Imag  Freqjs 
(AFreqHFB-Ts)  (cm~') 

TS-HFB  RMSD  AH 
(RMSD.HMelhyl)  (A) 

M{K) 

QM' 

Large 

-364.203028 

0.13 

-39.9 

-39.4 

913  (-12) 

0.089  (0.019) 

65  (33) 

QM^ 

Large 

-361.973508 

0.01 

-61.7 

-61.7 

701  (10) 

0.024  (0.014) 

193 (97) 

QM^ 

Small 

-283.891124 

0.02 

-63.0 

-65.5 

1343  (-3) 

0.059  (0.008) 

193 (97) 

QM^/MM,  R-Inc 

Large 

-283.894698 

0.01 

-63.8 

-65.5 

1332  (-1) 

0.033  (0.007) 

193 (97) 

QM^/MM,  R-Exc 

Large 

-283.888147 

0.01 

-62.6 

-65.5 

1294  (3) 

0.025  (0.005) 

193 (97) 

QM^/MM,  P-Inc 

Large 

-283.888020 

0.03 

-61.9 

-65.5 

1435  (5) 

0.057  (0.015) 

193 (97) 

QM^/MM,  P-Exc 

Large 

-283.888244 

0.06 

-63.3 

-65.3 

1295  (59) 

0.069  (0.009) 

193 (97) 

^Ejs  and  AE^fB-TS  values  show  the  absolute  energy  of  the  exact  TS  and  its  difference  with  the  CG-HFB  TS,  respectively.  E^  and  Ep  values  are  the  energies  of  the  reactant  and  product 
relative  to  the  exact  TS,  respectively.  QM^  and  QM^  are  B3LYP  and  HF  with  the  6-31G(d,p)  basis  set,  respectively.  QM^/MM  is  ONIOM[QM^;  AMBER].  ONIOM  calculations  were 
done  using  MM  parameters  of  the  reactant  with  the  transforming  bonds  included  in  (R-Inc)  and  excluded  (R-Exc)  from  the  connectivity  table  and,  similarly,  for  the  product  parameters 
(P-Inc  and  P-Exc).  The  MM  partial  charges  of  all  the  atoms  were  updated  on  the  fly  using  the  charge  equilibration  method.  TS-HFB  RMSD/^n  values  compare  structures  of  the  exact 
and  CG-HFB  TSs.  RMSD.HMethyi  excludes  the  hydrogen  atoms  of  the  methyl  groups  of  the  peptide  model.  The  force  constant  {k)  for  all  the  paths  was  1000  kcal/mol/A^.  Forces  were 
projected  during  optimization  for  all  the  potentials  except  QM^ .  The  Imag  Freqjs  column  shows  the  exact  imaginary  frequency  values  for  the  TSs  and  their  differences  with  CG-HFB 
values  in  parentheses.  The  M  {K)  column  shows  the  total  number  of  beads  (A/),  with  the  number  of  Fourier  terms  in  the  series  (AT)  used  in  the  final  optimized  path  in  parentheses. 


2.  Large  system  QM  results 

Figure  9  shows  the  B3LYP/6-31G(d,p)  energy  profile  for 
methanol  acylation  in  the  large  system.  The  reaction  mecha¬ 
nism  was  the  same  as  in  the  small  system.  The  harrier  height 
for  the  forward  reaction  was  39.9  kcal/mol. 

Full  QM  CG-HFB  path  optimization  also  gave  accurate 
transition  state  structures  for  the  large  system.  As  shown  in 
Table  III,  deviations  from  the  exact  transition  state  at  the 
B3LYP/6-31G(d,p)  potential  was  0.089  and  0.019  A,  con¬ 
sidering  all  atom  RMSD  and  excluding  methyl  hydrogen 
atoms,  respectively.  Correspondingly,  at  the  HF/6-31G(d,p) 
potential,  the  deviations  were  0.024  and  0.014  A  RMSDs. 
CG-HFB  overestimated  the  energy  of  the  transition  state  hy 
0.1  kcal/mol  for  both  QM  potentials. 

In  contrast  to  the  small  system,  the  reaction  was  en¬ 
ergy  neutral  at  the  HF/6-31G(d,p)  potential  and  0.4  kcal/mol 
endothermic  at  the  B3LYP/6-31G(d,p)  potential.  The  HF/ 
6-31G(d,p)  value  of  the  imaginary  frequency  of  the  transi¬ 
tion  state  in  the  large  system  was  almost  two  times  smaller 
than  in  the  small  system  (70H  cm“^  vs  1,343/  cm“'),  which 


FIG.  9.  B3LYP/6-31G(d,p)  energy  profile  for  alcohol  acylation  by  peptide 
mimic  in  large  system.  The  integral  energy  profile  is  shown  in  solid  blue.  The 
actual  energy  points  offset  to  match  the  integral  profile  are  shown  as  red  plus 
signs.  The  TS  is  marked  by  a  black  “+”. 


is  in  line  with  the  curvature  of  the  peaks  in  the  corresponding 
energy  profiles  (see  supplementary  material®^).  The  B3LYP/ 
6-31G(d,p)  potential  gave  a  value  of  913/  cm“'  for  the 
imaginary  frequency  in  the  large  system. 

3.  Large  system  ONIOM  QM/MM  results 

Table  III  shows  the  energetics  and  RMSDs  of  ONIOM 
optimized  acylation  reaction  paths.  Depending  on  the  pres¬ 
ence  of  the  bonded  terms  associated  with  covalent  bonds  un¬ 
dergoing  rearrangement  and  on  which  atom  types  and  vdW 
parameters  were  used,  the  ONIOM  transition  state  energy 
varied  by,  at  most,  1.8  kcal/mol.  With  all  ONIOM  schemes, 
the  reaction  was  exothermic  by  1.3-3. 6  kcal/mol,  which  en¬ 
compasses  the  2.5  kcal/mol  value  in  the  small  QM  case.  The 
imaginary  frequencies  at  the  ONIOM  transition  states  com¬ 
puted  with  all  bonded  terms  were  higher  than  those  where 
the  MM  potential  excluded  the  bonded  terms  of  the  affected 
covalent  bonds.  Also,  the  computed  imaginary  frequencies  at 
the  ONIOM  transition  states  were  closer  to  that  of  the  small 
model  QM  result. 

4.  Transferability  to  ONIOM  OM/MM  studies 
of  large  systems 

a.  Rugged  potential  energy  landscape.  In  general,  large 
systems  such  as  enzymes  have  many  “soft”  degrees  of  free¬ 
dom.  Consequently,  their  multidimensional  potential  energy 
surfaces  are  extremely  rugged  and  present  serious  challenges 
for  optimization. 

For  the  chain  of  states  path  optimization  methods,  the 
ruggedness  of  the  QM/MM  potential  energy  surfaces  can 
cause  the  beads  to  diverge  in  their  soft  degrees  of  freedom 
during  the  optimization,  rendering  the  corresponding  energy 
profiles  discontinuous.  This  issue  was  addressed  by  imposing 
positional  restraints  on  the  soft  degrees  of  freedom.^^  Dur¬ 
ing  the  path  optimization,  the  force  constants  of  the  restraints 
were  gradually  reduced  while  keeping  their  anchoring  posi¬ 
tions  unchanged. 

The  combined  CG-HFB  method  provides  an  alternative 
solution  to  the  problem  associated  with  the  ruggedness  of 
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potential  energy  landscapes.  As  was  mentioned  earlier,  the 
adaptive  positional  restraints  in  CG-HFB  transform  the 
rugged  QM/MM  potential  energy  surface  into  a  funnel-like 
surface  that  is  simpler  to  optimize.^®’ Thus,  CG-HFB  up¬ 
dates  the  anchoring  positions  but  not  the  force  constants  of 
the  restraints  during  the  path  optimization.  With  sufficiently 
stiff  restraints  CG-HFB  completely  prevents  discontinuities 
in  the  soft  degrees  of  freedom  along  the  path. 

Although  the  “large”  system  in  the  present  study  was 
much  smaller  than  that  in  a  typical  QM/MM  study,  it  did  con¬ 
tain  soft  degrees  of  freedom  that  are  characteristic  of  the  latter 
and  could  cause  problems  during  optimization.  In  particular, 
rotations  of  the  three  methyl  groups  shown  in  Fig.  2  could 
disrupt  continuity  of  the  path  and  of  the  energy  profile.^’’ 

The  combined  CG-HFB  method  performed  well  with  re¬ 
spect  to  the  soft  degrees  of  freedom.  The  only  problem  that 
we  found  due  to  the  soft  degrees  of  freedom  was  the  occa¬ 
sional  excursions  of  the  path  to  sample  the  methyl  rotations 
in  the  early  stages  of  path  optimization  (see  supplementary 
material®"^).  However,  the  segments  involving  the  methyl  ro¬ 
tations  could  be  trivially  discarded  producing  clean  energy 
profiles  like  the  one  depicted  in  Fig.  9.  Therefore,  we  antici¬ 
pate  the  combined  CG-HFB  method  will  provide  high  quality 
QM/MM  reaction  paths  and  potential  energy  prohles  for  large 
systems. 

b.  Conformational  sampling.  Free  energy  profiles  or  po¬ 
tentials  of  mean  force  (PMFs)  along  the  path  are  preferred 
to  the  potential  energy  profiles  because  they  can  be  directly 
compared  with  experiment. This  is  particularly  true  of  the 
large  systems  with  rugged  energy  landscapes.  However,  com¬ 
puting  a  valid  reaction  PMF  requires  conformational  sam¬ 
pling  and  a  proper  choice  of  reaction  coordinate.  A  poor 
choice  of  reaction  coordinate  will  likely  yield  an  invalid  PMF. 
Although  the  HFB  method  was  developed  to  compute  PMFs 
in  large  systems,  it  is  impractical  to  compute  QM/MM  PMFs 
directly. 

Indirect,  more  affordable  approaches  have  been  devel¬ 
oped  to  estimate  QM/MM  PMFs.^*“®'  Importantly,  the  indi¬ 
rect  QM/MM  PMF  calculations  critically  depend  on  reaction 
paths  optimized  on  QM/MM  potential  energy  surfaces.^®"®' 
For  a  given  configuration  of  the  system,  the  paths  optimized 
with  the  combined  CG-HFB  method  should  give  the  best  es¬ 
timate  of  the  reaction  barrier  and  identify  the  best  reaction 
coordinate.  Subsequently,  these  could  be  used  to  obtain  a 
valid  QM/MM  PMF.  Therefore,  the  reaction  paths  optimized 
on  QM/MM  potential  energy  surfaces  are  prerequisite  to 
successful  QM/MM  PMF  calculations. 

IV.  CONCLUSIONS 

In  this  report  we  applied  a  combined  CG-HFB  reaction 
path  optimizations  method  to  study  chemical  reactions  on 
QM  and  QM/MM  potential  energy  surfaces.  With  the  help  of 
CG-HFB,  we  were  able  to  identify  nontrivial,  intramolecular 
paths  for  spontaneous  alanine  inversion  in  the  gas  phase 
starting  from  three  crude  intermediate  states.  The  most  fa¬ 
vorable  of  the  paths  proceeded  through  previously  unknown 


intermediates,  namely,  2-iminopropane-l,l-diol  and  3-amino- 
3-methyloxiran-2-ol.  Furthermore,  with  CG-HFB,  we  were 
able  to  optimize  bimolecular  reaction  paths  for  acylation  of  an 
alcohol  by  peptide  models  despite  the  poorly  dehned  relative 
orientation  of  the  reacting  molecules  and  their  products. 

In  all  cases,  CG-HFB  delivered  high-quality  transition 
states  that  were  in  excellent  agreement  with  the  exact  tran¬ 
sition  states.  In  the  worst  case,  CG-HFB  overestimated  a  tran¬ 
sition  state  energy  by  0.1  kcal/mol.  All  identihed  transition 
states  were  automatically  validated  by  CG-HFB  instead  of 
using  IRC  calculations.  The  results  demonstrate  that  the  com¬ 
bined  CG-HFB  approach  is  a  practical  tool  for  finding  accu¬ 
rate  transition  states  with  little  prior  knowledge  of  reaction 
mechanisms. 

We  anticipate  that  the  combined  CG-HFB  method  will  be 
directly  applicable  to  QM/MM  studies  of  chemical  reactions 
in  much  larger  systems,  such  as  enzymes.  Our  choices  of  CG 
instead  of  quasi-Newton  optimizer  and  of  the  Cartesian  in¬ 
stead  of  nonlinear  coordinates  were  made  keeping  these  large 
systems  in  mind.  Future  work  will  be  concerned  with  appli¬ 
cations  of  the  combined  CG-HFB  method  to  QM/MM  studies 
of  enzyme  catalysis  and  inhibition. 
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